Content and Link Structure Analysis for Searching the Web
Authors
Abstract
Automated search engines continuously discover, index, and store information about web pages. When a user issues a query, this repository is searched to find a result set of the most relevant pages. An ideal search scheme must satisfy two basic requirements: high recall and high precision. Recall measures the ability of an algorithm to find as many relevant pages as possible. Precision measures the ability of an algorithm to reject as many nonrelevant pages as possible. An ideal search algorithm should find all of the relevant pages, rank them by relevance to the user query, and present a rank-ordered result to the user. The earlier generations of search engines relied solely on keyword matching to perform the search. Unfortunately, this approach did not work very well: too many nonrelevant pages were returned along with relevant ones, and their rankings rarely agreed with users' interests. Since user queries are short, usually consisting of 2-3 words, the problems associated with synonymy and polysemy make it particularly difficult to evaluate which pages will be of interest to a user. The user is more likely to be interested in a page if it contains authoritative information on its subject and it is relevant to the user query. Authoritative pages are usually cited by others frequently, and the link
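The recall and precision requirements named in the abstract can be illustrated with a minimal sketch. It assumes the standard set-based definitions (precision as the fraction of returned pages that are relevant, recall as the fraction of relevant pages that are returned), which the abstract does not spell out. The function name `precision_recall` and the page identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: page ids returned by the search (assumed input)
    relevant:  page ids judged relevant to the query (assumed ground truth)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant pages that were actually returned

    # Precision: share of returned pages that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant pages that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four pages returned, three relevant pages exist,
# and two of the returned pages are relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Returning more pages tends to raise recall while lowering precision, which is why the abstract treats the two as separate requirements.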
Similar resources
Shear-Flexural Interaction in Analysis of Reduced Web Section Beams using VM Link Element
Reduced web section beams in shear-yielding moment-resistant steel frames are used to dissipate earthquake energy. Finite element analysis indicates that the failure mode of these beams is governed by the combination of shear force and flexural moment. Therefore, the analysis of frames with reduced web section beams needs to consider shear-flexural interaction in those sections. In ...
Searching for web communities through site level link and content analysis
In recent years, heavy user growth in Web 2.0 applications such as blogs and online social networks has helped support studies into classifying various web communities. Researchers have made many interesting finds, e.g. identifying subgroups within large "blogospheres." However, there has been less focus starting from a small community of sites to find other sites that are potentially part of t...
Lexical and semantic clustering by Web links
Recent Web searching and mining tools are combining text and link analysis to improve ranking and crawling algorithms. The central assumption behind such approaches is that there is a correlation between the graph structure of the Web and the text and meaning of pages. Here I formalize and quantitatively validate two conjectures drawing connections from linkage information to lexical and semant...
A Survey Paper of Structure Mining Technique using Clustering and Ranking Algorithm
A survey of various link analysis and clustering algorithms, such as Page Rank, Hyperlink-Induced Topic Search, Weighted Page Rank based on Visit of Links, K-Means, and Fuzzy K-Means. Among the ranking algorithms illustrated, Weighted Page Rank is more efficient than Hyperlink-Induced Topic Search, whereas among the clustering algorithms described, Fuzzy Soft Rough K-Means is a mixture of Rough K-Means and fuzzy soft...
The Content and Structure of Electronic Personal Health Records: A Systematic Review
Introduction: The electronic Personal Health Record (ePHR) improves people’s awareness and care management and leads to health promotion. One of the most important factors that contributes to the development of ePHR is identifying and understanding its content and structure. No comprehensive studies have so far been performed on the content and structure of ePHRs. Therefore, the purpose of this...